HBASE-30115 Introduce approximate progress estimation for TableRecordReader based on row key position#8134
jinhyukify wants to merge 5 commits into apache:master
Thanks, here are my initial thoughts. Pluggable
@junegunn Thank you for your feedback. Pluggable
I've just fixed test failures in |
Pull request overview
This PR (HBASE-30115) adds an approximate progress estimation mechanism for TableRecordReader by mapping the last-read row key into a normalized fraction of the scan’s start/stop key range. This improves MapReduce task progress reporting without requiring tuple counting.
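The mapping described above can be sketched as follows. This is a minimal illustration, not the PR's code: the class name `UniformProgressSketch` and the methods `toUnsigned` and `estimate` are hypothetical, and it assumes both scan bounds are non-empty (the PR's probing for empty bounds is not modelled). Keys are right-padded with zero bytes to a common length, read as unsigned big-endian integers, and linearly interpolated.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;

/** Hypothetical sketch of position-based progress; not the PR's UniformRowKeyProgress. */
class UniformProgressSketch {

  /** Right-pads the key with 0x00 to len bytes and reads it as an unsigned big-endian integer. */
  static BigInteger toUnsigned(byte[] key, int len) {
    byte[] padded = new byte[len];
    System.arraycopy(key, 0, padded, 0, Math.min(key.length, len));
    return new BigInteger(1, padded); // signum 1 forces an unsigned interpretation
  }

  /** Returns (current - start) / (stop - start), clamped to [0, 1]. */
  static float estimate(byte[] start, byte[] stop, byte[] current) {
    int len = Math.max(Math.max(start.length, stop.length), current.length);
    BigInteger lo = toUnsigned(start, len);
    BigInteger hi = toUnsigned(stop, len);
    BigInteger cur = toUnsigned(current, len);
    BigInteger range = hi.subtract(lo);
    if (range.signum() <= 0) {
      return 0f; // assumption: unknown or empty range reports no progress
    }
    // BigDecimal division avoids overflow to Infinity for very long keys.
    double fraction = new BigDecimal(cur.subtract(lo))
        .divide(new BigDecimal(range), MathContext.DECIMAL64)
        .doubleValue();
    return (float) Math.max(0.0, Math.min(1.0, fraction));
  }

  public static void main(String[] args) {
    byte[] start = {0x00};
    byte[] stop = {0x10};
    byte[] lastRead = {0x08};
    System.out.println(estimate(start, stop, lastRead));
  }
}
```

The estimate is only as good as the assumption that row keys are spread evenly over the byte range, which is why the result is described as approximate.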
Changes:
- Added a pluggable `RowKeyProgress` interface with default (`UniformRowKeyProgress`) and hex-specific (`HexStringRowKeyProgress`) implementations.
- Updated `TableRecordReaderImpl#getProgress()` to return an estimated fraction based on the last successfully read row key (with optional probing for empty start/stop bounds).
- Added unit tests for both progress implementations.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java | Initializes a RowKeyProgress estimator and uses it to report approximate scan progress; includes probing logic for empty bounds. |
| hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/RowKeyProgress.java | Introduces the progress-estimation SPI and configuration key. |
| hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/UniformRowKeyProgress.java | Default progress estimator treating row keys as big-endian unsigned byte sequences. |
| hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/HexStringRowKeyProgress.java | Progress estimator for ASCII hex-encoded row keys (e.g., hash prefixes). |
| hbase-mapreduce/src/test/java/org/apache/hadoop/hbase/mapreduce/TestUniformRowKeyProgress.java | Unit tests for UniformRowKeyProgress. |
| hbase-mapreduce/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHexStringRowKeyProgress.java | Unit tests for HexStringRowKeyProgress. |
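
To illustrate why a hex-specific estimator is useful: ASCII hex keys use only 16 of the 256 possible values per byte, so a raw-byte interpolation would be skewed. One way to handle this, sketched below under stated assumptions, is to parse the key as a base-16 number before interpolating. The class `HexProgressSketch` and its methods are hypothetical and are not taken from the PR; the sketch assumes every key byte is an ASCII hex digit and that both bounds are non-empty.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch for ASCII hex row keys; not the PR's HexStringRowKeyProgress. */
class HexProgressSketch {

  /** Right-pads the key with '0' to len digits and parses it as a base-16 integer. */
  static BigInteger parseHex(byte[] key, int len) {
    if (len == 0) {
      return BigInteger.ZERO;
    }
    StringBuilder sb = new StringBuilder(len);
    for (int i = 0; i < len; i++) {
      char c = i < key.length ? (char) (key[i] & 0xff) : '0';
      if (Character.digit(c, 16) < 0) {
        throw new IllegalArgumentException("not a hex digit at index " + i);
      }
      sb.append(c);
    }
    return new BigInteger(sb.toString(), 16);
  }

  /** Returns (current - start) / (stop - start) in hex-digit space, clamped to [0, 1]. */
  static float estimate(byte[] start, byte[] stop, byte[] current) {
    int len = Math.max(Math.max(start.length, stop.length), current.length);
    BigInteger lo = parseHex(start, len);
    BigInteger range = parseHex(stop, len).subtract(lo);
    if (range.signum() <= 0) {
      return 0f; // assumption: unknown or empty range reports no progress
    }
    double fraction = new BigDecimal(parseHex(current, len).subtract(lo))
        .divide(new BigDecimal(range), MathContext.DECIMAL64)
        .doubleValue();
    return (float) Math.max(0.0, Math.min(1.0, fraction));
  }

  public static void main(String[] args) {
    byte[] start = "00".getBytes(StandardCharsets.US_ASCII);
    byte[] stop = "80".getBytes(StandardCharsets.US_ASCII);
    byte[] lastRead = "40".getBytes(StandardCharsets.US_ASCII);
    System.out.println(estimate(start, stop, lastRead));
  }
}
```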
Jira https://issues.apache.org/jira/browse/HBASE-30115